decision tree splitting criterion
The Simple Math behind 3 Decision Tree Splitting criterions
Decision Trees are great and are useful for a variety of tasks. They form the backbone of most of the best performing models in the industry like XGboost and Lightgbm. But how do they work exactly? In fact, this is one of the most asked questions in ML/DS interviews. We generally know they work in a stepwise manner and have a tree structure where we split a node using some feature on some criterion.
The Simple Math behind 3 Decision Tree Splitting criterions
Gini impurity is a measure of how often a randomly chosen element from the set would be incorrectly labeled if it was randomly labeled according to the distribution of labels in the subset. In simple terms, Gini impurity is the measure of impurity in a node. So to understand the formula a little better, let us talk specifically about the binary case where we have nodes with only two classes. So in the below five examples of candidate nodes labelled A-E and with the distribution of positive and negative class shown, which is the ideal condition to be in? I reckon you would say A or E and you are right.